Fix `segmentation.MeanIoU` #2698

vkinakh · 2024-08-21T16:26:12Z

What does this PR do?

Fixes #2558
Fixes a few typos

Was this discussed/agreed via a GitHub issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure to update the docs?
Did you write any new necessary tests?

This PR fixes the segmentation.MeanIoU metric.

Issues

When the metric is updated by calling self.update, self.score is accumulated with each call. When self.compute is then called, the accumulated sum is returned instead of the mean.
When the metric is updated by calling self.forward, the metric works correctly.

This happens because when MeanIoU is updated via self.forward, it calls the self._reduce_states method, which updates the score using the following formula:

reduced = ((self._update_count - 1) * global_state + local_state).float() / self._update_count

where global_state is the score accumulated over the previous steps and local_state is the score on the current batch.
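A minimal sketch of the difference between the two paths, in plain Python rather than torch. The per-batch scores are made-up values and `reduce_mean` is a hypothetical helper that only mirrors the formula quoted above; it is not the torchmetrics implementation.

```python
def reduce_mean(global_state, local_state, update_count):
    """Running mean, mirroring the formula quoted above (hypothetical helper)."""
    return ((update_count - 1) * global_state + local_state) / update_count


batch_scores = [0.5, 0.7, 0.9]  # made-up per-batch IoU values

# forward-style merging: the state stays a running mean of the batch scores
running = 0.0
for count, score in enumerate(batch_scores, start=1):
    running = reduce_mean(running, score, count)

# update-style accumulation before the fix: the batch scores are simply added up
accumulated = 0.0
for score in batch_scores:
    accumulated += score

print(running)      # approximately the mean of the batch scores
print(accumulated)  # approximately the sum of the batch scores
```

With three batches, the first path ends near the mean of the scores while the second ends near their sum, which is the mismatch described in this section.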

Solution

To solve this issue:

use the sum reduce function for self.score
add a num_batches state to track the number of processed batches
increment num_batches in every self.update call
in self.compute, return the sum of scores divided by the number of processed batches
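The four steps above can be sketched in plain Python as follows. `MeanScoreSketch` is a hypothetical stand-in, not the torchmetrics class: it takes an already computed per-batch score instead of predictions and targets, and it assumes at least one update before compute is called.

```python
class MeanScoreSketch:
    """Hypothetical stand-in for the fixed metric, not the torchmetrics class."""

    def __init__(self):
        self.score = 0.0      # sum of per-batch scores (a "sum"-reduced state)
        self.num_batches = 0  # number of processed batches

    def update(self, batch_score):
        # accumulate the batch score and count the batch
        self.score += batch_score
        self.num_batches += 1

    def compute(self):
        # mean over batches; assumes update was called at least once
        return self.score / self.num_batches


metric = MeanScoreSketch()
for batch_score in [0.5, 0.7, 0.9]:  # made-up per-batch scores
    metric.update(batch_score)
result = metric.compute()
print(result)
```

Because both states are plain sums, merging a global and a local state by addition keeps the ratio in compute meaningful, which is presumably why the sum reduction is the natural choice here.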

📚 Documentation preview 📚: https://torchmetrics--2698.org.readthedocs.build/en/2698/

codecov · 2024-08-21T20:27:08Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69%. Comparing base (4ce2278) to head (162761d).
Report is 1 commits behind head on master.

Additional details and impacted files

@@          Coverage Diff           @@
##           master   #2698   +/-   ##
======================================
  Coverage      69%     69%           
======================================
  Files         316     316           
  Lines       17907   17909    +2     
======================================
+ Hits        12333   12335    +2     
  Misses       5574    5574

Borda

great work, can we please add a test to prevent this issue in the future?

src/torchmetrics/segmentation/mean_iou.py

vkinakh · 2024-08-24T22:35:56Z

I have also noticed that the names and descriptions in the tests for GeneralizedDiceScore are just a copy-paste of the MeanIoU tests. The tests are correct, but the names and descriptions are incorrect and misleading.

Should I create a new issue?

torchmetrics/tests/unittests/segmentation/test_generalized_dice_score.py

Lines 66 to 108 in 24e99c5

    
    class TestMeanIoU(MetricTester):
        """Test class for `MeanIoU` metric."""

        @pytest.mark.parametrize("ddp", [pytest.param(True, marks=pytest.mark.DDP), False])
        def test_mean_iou_class(self, preds, target, input_format, include_background, ddp):
            """Test class implementation of metric."""
            self.run_class_metric_test(
                ddp=ddp,
                preds=preds,
                target=target,
                metric_class=GeneralizedDiceScore,
                reference_metric=partial(
                    _reference_generalized_dice,
                    input_format=input_format,
                    include_background=include_background,
                    reduce=True,
                ),
                metric_args={
                    "num_classes": NUM_CLASSES,
                    "include_background": include_background,
                    "input_format": input_format,
                },
            )

        def test_mean_iou_functional(self, preds, target, input_format, include_background):
            """Test functional implementation of metric."""
            self.run_functional_metric_test(
                preds=preds,
                target=target,
                metric_functional=generalized_dice_score,
                reference_metric=partial(
                    _reference_generalized_dice,
                    input_format=input_format,
                    include_background=include_background,
                    reduce=False,
                ),
                metric_args={
                    "num_classes": NUM_CLASSES,
                    "include_background": include_background,
                    "per_class": False,
                    "input_format": input_format,
                },
            )

Borda · 2024-09-02T09:26:57Z

I have also noticed that the names and descriptions in the tests for GeneralizedDiceScore are just a copy-paste of the MeanIoU tests.

that was a typo, fixing it in #2709

SkafteNicki · 2024-09-03T10:29:20Z

Added generalized testing that forward/update works as expected (see the review comment on #2698) in commit 3d3d7b5. Hopefully, this check should pass for all other metrics; otherwise, let's move this to another PR.

vkinakh · 2024-09-03T12:14:13Z

Added generalized testing that forward/update works as expected (see the review comment on #2698) in commit 3d3d7b5. Hopefully, this check should pass for all other metrics; otherwise, let's move this to another PR.

I suggest adding a test case to verify the metric aggregation behavior: the aggregated metric of a batch with 2*N predictions should match the aggregated metrics from two separate updates, each with N predictions. Are there any metrics that are not expected to behave this way?
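A rough sketch of what such a check could look like, in plain Python. It assumes that each update stores the mean score of its batch and that compute averages over batches; `batch_score` and `aggregate` are hypothetical helpers and the per-sample scores are made up.

```python
def batch_score(sample_scores):
    """Mean score of one batch (assumed per-update behaviour)."""
    return sum(sample_scores) / len(sample_scores)


def aggregate(batches):
    """Sum of per-batch scores divided by the number of batches."""
    total, num_batches = 0.0, 0
    for batch in batches:
        total += batch_score(batch)
        num_batches += 1
    return total / num_batches


samples = [0.2, 0.4, 0.6, 0.8]  # made-up per-sample scores, 2*N with N = 2

one_big = aggregate([samples])                       # one update with 2*N samples
two_halves = aggregate([samples[:2], samples[2:]])   # two updates with N samples each
uneven = aggregate([samples[:1], samples[1:]])       # batches of unequal size

print(one_big, two_halves, uneven)
```

Under these assumptions the 2*N batch and the two N-sized batches agree, while the uneven split does not, so a metric that averages per-batch means would only satisfy the proposed check for equal batch sizes.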

- use sum reduce function for score
- add state `num_batches` to keep number of processed batches
- add increment of `num_batches` in every `update` call
- in `compute` return sum of scores divided by number of processed batches

---------

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

(cherry picked from commit cb1ab37)

vkinakh requested review from SkafteNicki, justusschock, Borda, lantiga and stancld as code owners August 21, 2024 16:26

vkinakh closed this Aug 21, 2024

vkinakh reopened this Aug 21, 2024

vkinakh marked this pull request as draft August 21, 2024 16:42

vkinakh marked this pull request as ready for review August 21, 2024 16:59

Borda changed the title ~~Fix segmentation.MeanIoU (#2558)~~ Fix segmentation.MeanIoU Aug 22, 2024

Borda reviewed Aug 22, 2024

Borda added the bug / fix Something isn't working label Aug 28, 2024

Borda self-requested a review September 3, 2024 10:02

SkafteNicki added this to the v1.4.x milestone Sep 3, 2024

SkafteNicki approved these changes Sep 3, 2024

mergify bot added the ready label Sep 4, 2024

Borda approved these changes Sep 6, 2024

Borda enabled auto-merge (squash) September 6, 2024 09:05

mergify bot added has conflicts and removed has conflicts labels Sep 9, 2024

justusschock approved these changes Sep 10, 2024

mergify bot added has conflicts and removed has conflicts labels Sep 10, 2024

Borda force-pushed the master branch from 96ceda0 to f12e7af Compare September 11, 2024 15:10

mergify bot added the has conflicts label Sep 11, 2024

vkinakh and others added 6 commits September 11, 2024 17:16

Fix MeanIoU (Lightning-AI#2558)

ef88d97

- use sum reduce function for score
- add state `num_batches` to keep number of processed batches
- add increment of `num_batches` in every `update` call
- in `compute` return sum of scores divided by number of processed batches

Fix typos

d4d3e05

Update compute description

b686569

changelog

2d0fdc3

add general testing

c7e722b

fix dict case

640c664

Borda force-pushed the fix/segmentation-meaniou branch from 87f2d8e to 86ca3f0 Compare September 11, 2024 15:16

github-actions bot added topic: Audio topic: Nominal labels Sep 11, 2024

mergify bot removed the has conflicts label Sep 11, 2024

Borda force-pushed the fix/segmentation-meaniou branch from 86ca3f0 to 640c664 Compare September 11, 2024 15:17

github-actions bot removed topic: Audio topic: Nominal labels Sep 11, 2024

Borda and others added 2 commits September 11, 2024 22:03

Merge branch 'master' into fix/segmentation-meaniou

2d5af5e

Merge branch 'master' into fix/segmentation-meaniou

162761d

Borda disabled auto-merge September 11, 2024 22:10

Borda merged commit cb1ab37 into Lightning-AI:master Sep 11, 2024
52 of 57 checks passed
